An artificial neural network model for evaluating the risk of hyperuricaemia in type 2 diabetes mellitus

Type 2 diabetes with hyperuricaemia may lead to gout, kidney damage, hypertension and coronary heart disease, further aggravating diabetes and adding to the medical and financial burden. This study aimed to construct a risk model for hyperuricaemia in patients with type 2 diabetes mellitus based on an artificial neural network, and to evaluate the effectiveness of the risk model in order to provide directions for the prevention and control of the disease in this population. From June to December 2022, 8243 patients with type 2 diabetes were recruited from six community service centers for a questionnaire survey and physical examination. The collected data were then used to select suitable variables and, based on the comparison results, logistic regression was used to screen the variable characteristics. Finally, three risk models for evaluating the risk of hyperuricaemia in type 2 diabetes mellitus were developed using an artificial neural network algorithm and evaluated for performance. A total of eleven factors were found to affect the development of hyperuricaemia in patients with type 2 diabetes mellitus in this study: gender, waist circumference, diabetes medication use, diastolic blood pressure, γ-glutamyl transferase, blood urea nitrogen, triglycerides, low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, fasting glucose and estimated glomerular filtration rate. Among the generated models, the baseline & biochemical risk model had the best performance, with a cutoff, area under the curve, accuracy, recall, specificity, positive likelihood ratio, negative likelihood ratio, precision, negative predictive value, kappa and F1-score of 0.488, 0.744, 0.689, 0.625, 0.749, 2.489, 0.501, 0.697, 0.684, 0.375 and 0.659, respectively. In addition, its Brier score was 0.169, and the calibration curve showed good agreement between fitted and observed values.
The constructed artificial neural network model showed good efficacy and may help reduce the harm caused by type 2 diabetes mellitus combined with hyperuricaemia.

Type 2 diabetes is a chronic metabolic disease caused by a combination of genetic, dietary and environmental factors. The disease is characterised by insufficient insulin secretion or an inability to utilise insulin efficiently, resulting in persistent elevation of blood glucose 1. According to a report by the World Health Organization (WHO), diabetes was the direct cause of 1.5 million deaths in 2019, with 48% of diabetes deaths occurring before the age of 70 2. According to data released by the International Diabetes Federation (IDF) in 2021, there are 140 million people with diabetes in China, with a prevalence of approximately 10.6%; both the number of people with the disease and the prevalence are on the rise, and type 2 diabetes accounts for more than 90% of the Chinese diabetic population 3. In addition, diabetes mellitus may lead to various complications, such as blindness 4, kidney failure 5 and hypertension 6, owing to factors such as poor blood sugar control over a long period of time.
Hyperuricaemia is the greatest risk factor for gout 7 and is mainly due to excessive production or poor excretion of uric acid, the main source of which is purines 8. Past studies have shown that hyperuricaemia is a risk factor for diabetes mellitus, cardiovascular disease, metabolic syndrome and other diseases [9][10][11]. A meta-analysis showed that the prevalence of hyperuricaemia in the Chinese population was 16.4% 12.
Studies have shown that type 2 diabetes and hyperuricaemia can interact. On the one hand, people with type 2 diabetes are often insulin resistant, which may increase tubular reabsorption of uric acid and thereby lead to hyperuricaemia 13,14. On the other hand, epidemiological studies have shown that hyperuricaemia is a risk factor for insulin resistance, prediabetes and diabetes 9,11,15. In addition, recent evidence suggests that high levels of uric acid interfere with insulin signalling in endothelial cells at both the receptor and post-receptor levels, and that at the post-receptor level, both proximal (IRS and PI3K-Akt components) and distal (eNOS-NO system) steps of the insulin signalling pathway are affected by uric acid 16. Risk predictors of high uric acid levels in patients with type 2 diabetes mellitus have been explored, including hip circumference, total cholesterol and high-density lipoprotein 17,18.
Previous studies have applied the Cox regression model and machine learning methods to build risk models of hyperuricaemia based on sociodemographic data, routine physical examination markers, dietary risk factors, blood biomarkers, and alterations of the gut microbiome [19][20][21][22][23][24]. However, these studies modelled hyperuricaemia only in healthy populations, although one study explored the development of a hyperuricaemia risk model in patients with diabetic kidney disease 25. To the best of our knowledge, only one study has established a predictive model of hyperuricaemia in the type 2 diabetic population 26.
In recent years, artificial neural networks (ANN) have become popular and useful models for classification, clustering, pattern recognition and prediction in many disciplines. They are fast and widely used in dealing with a variety of complex real-world problems 27. The popularity of ANN lies in its information processing characteristics, including learning ability, high parallelism, fault tolerance, nonlinearity, noise tolerance and generalisation 27,28. In addition, Dalakleidi K et al. 29 showed that ANN was superior to other machine learning algorithms. Although the advantages of ANN are obvious, previous studies have not used the ANN algorithm to model the risk of hyperuricaemia.
In this study, we constructed a risk model for hyperuricaemia in patients with type 2 diabetes mellitus based on the ANN algorithm, and assessed the validity of the model. This has an important role in clinically distinguishing high-risk individuals and identifying risk factors, which in turn has far-reaching significance in alleviating disease symptoms, reducing the risk of patient death and reducing the healthcare burden.

Study participants
This was a retrospective cross-sectional survey. Between June and December 2022, we randomly recruited patients with type 2 diabetes from one community in each of the six urban areas of Fuzhou City. All participants underwent a face-to-face survey using a self-designed standardised questionnaire and a physical examination, both of which were conducted by trained primary care professionals.
Patients with malignancy, a history of gout, hyperuricaemia occurring before type 2 diabetes mellitus, type 1 diabetes mellitus, gestational diabetes mellitus or other specific types of diabetes mellitus were not included in this study. After exclusion of cases with incomplete physical examination data, a total of 8243 cases were obtained.
All respondents completed an informed consent form, and the ethics committee of the Fuzhou Center for Disease Control and Prevention (approval number: 2022002) approved the research. In addition, all participants and/or their legal guardians consented to the use of their medical data in this study. This study was carried out in accordance with the Declaration of Helsinki.

Data measurements
Basic personal information and medical history were investigated through questionnaires, including gender, age, history of smoking and alcohol consumption, duration of diabetes, medication history, etc.

Statistical methods
Data were double entered using EpiData (version 3.1) and analysed using IBM SPSS (version 22.0) and RStudio (version 4.2.3). Measurement data conforming to a normal distribution were expressed as mean ± standard deviation (x̄ ± s) and compared between groups by a t test, and count data were compared by a chi-square test. Univariate and multivariate logistic regression analyses were conducted using uric acid levels as the dependent variable and sociodemographic characteristics and physiological and biochemical indicators as independent variables, with variables introduced and excluded at a test level of 0.05. The variance inflation factor (VIF) was used to examine collinearity among the independent variables included in the multivariate logistic regression analysis. Data management and statistical analysis were conducted using R version 4.3.2.
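The collinearity check described above can be illustrated with a minimal sketch. The study does not report how the VIF was computed in its software, so the code below is only one possible route: it relies on the standard identity that the VIF of predictor j, 1/(1 − R_j²), equals the j-th diagonal entry of the inverse of the predictors' correlation matrix. The function names and the two toy columns (`waist`, `bmi`) are hypothetical and are not study data.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centred = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centred]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centred[i], centred[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def vif(columns):
    """VIF per predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Hypothetical toy predictors (not study data); their correlation is 0.8.
waist = [1.0, 2.0, 3.0, 4.0, 5.0]
bmi = [2.0, 1.0, 4.0, 3.0, 5.0]
toy_vifs = vif([waist, bmi])
print([round(v, 2) for v in toy_vifs])
```

With a correlation of 0.8 between the two toy columns, each VIF is 1/(1 − 0.64), about 2.78, which would pass the VIF < 5 threshold applied in this study.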

Development and validation of the classification models
We utilized multivariable stepwise logistic regression analysis for variable selection. The ANN algorithm was used to build models for three different data scenarios (baseline data only, biochemical indicators only, and baseline data plus biochemical indicators). The incorporated data were divided into a training-testing set (80%) and an independent validation set (20%) using stratified sampling. We utilized grid search to explore the hyperparameter space efficiently, which allowed us to find the optimal combination of hyperparameters for the three ANN models. To avoid overfitting and improve the generalisation of the models, we used tenfold cross-validation on the training-testing set and then applied the best models to the independent validation set.
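The partitioning scheme in this paragraph (a stratified 80/20 split, tenfold cross-validation on the 80% portion, and a grid of candidate hyperparameters) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the study does not report its software calls or its grid, so the function names, the hyperparameter names (`hidden_units`, `learning_rate`) and the toy outcome vector are all hypothetical, and no network is actually trained here.

```python
import itertools
import random

def stratified_split(labels, holdout_fraction=0.2, seed=0):
    """Return (development_idx, validation_idx), sampling each class separately."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    dev, val = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_val = round(len(indices) * holdout_fraction)
        val.extend(indices[:n_val])
        dev.extend(indices[n_val:])
    return sorted(dev), sorted(val)

def stratified_folds(indices, labels, k=10, seed=0):
    """Partition `indices` into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for idx in indices:
        by_class.setdefault(labels[idx], []).append(idx)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        rng.shuffle(members)
        # Deal the members of each class round-robin across the folds.
        for position, idx in enumerate(members):
            folds[position % k].append(idx)
    return folds

# Hypothetical hyperparameter grid; the study does not report the one it searched.
grid = {"hidden_units": [4, 8, 16], "learning_rate": [0.01, 0.1]}
candidates = [dict(zip(grid, values)) for values in itertools.product(*grid.values())]

# Toy outcome vector (1 = hyperuricaemia, 0 = no hyperuricaemia); not study data.
labels = [1] * 200 + [0] * 800
dev_idx, val_idx = stratified_split(labels)
folds = stratified_folds(dev_idx, labels)

print(len(dev_idx), len(val_idx), len(folds), len(candidates))
```

In a full pipeline, each candidate in `candidates` would be trained on nine folds and scored on the tenth, rotating through all ten folds, and only the best-scoring configuration would then be evaluated once on `val_idx`.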
The areas under the curves (AUCs) of the three ANN models in the training-testing set were evaluated to assess model performance. In addition, we calculated performance metrics including AUC, accuracy, recall, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), precision, negative predictive value (NPV), kappa, and F1-score. After a comparison of the above performance metrics, the optimal ANN model was selected and is shown in Fig. 2. Finally, the calibration curve was analyzed to assess agreement using its slope, intercept, and Brier score (an ideal value of 0; a value of > 0.3 indicates poor calibration).
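For reference, the threshold-based metrics listed above and the Brier score follow directly from a 2 × 2 confusion matrix and the predicted probabilities. The sketch below uses the standard textbook definitions; the confusion-matrix counts and probabilities are hypothetical toy values, not results from this study, and the function names are illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from the counts of a 2 x 2 confusion matrix."""
    total = tp + fp + tn + fn
    recall = tp / (tp + fn)            # sensitivity
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)         # positive predictive value
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / total
    # Agreement expected by chance, used for Cohen's kappa.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "PLR": recall / (1 - specificity),
        "NLR": (1 - recall) / specificity,
        "precision": precision,
        "NPV": npv,
        "kappa": (accuracy - expected) / (1 - expected),
        "F1": 2 * precision * recall / (precision + recall),
    }

def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)

# Hypothetical counts and predictions, for illustration only.
metrics = classification_metrics(tp=50, fp=20, tn=100, fn=30)
toy_brier = brier_score([0.9, 0.2, 0.6], [1, 0, 0])
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"Brier: {toy_brier:.3f}")
```

The same definitions provide a quick consistency check on reported values: a recall of 0.625 and a specificity of 0.749 imply a PLR of about 2.49 and an NLR of about 0.50.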

Ethics approval and consent
This study was approved by the Ethics Committee of the Fuzhou Center for Disease Control and Prevention (approval number: 2022002). Informed consent was obtained from all participants and/or their legal guardians for this study. There is no conflict of interest in this study.

Demographic characteristics
A total of 8243 diabetic patients were investigated in this survey. Table 1 shows descriptive statistics of sample characteristics, including age, gender, tobacco use, alcohol use, exercise, waist circumference, BMI, disease duration, diabetes medication use, systolic blood pressure and diastolic blood pressure.

Univariate and multivariate analyses of baseline information
Baseline data were included in separate univariate logistic regression analyses, which screened a total of eight variables: gender, tobacco use, alcohol use, exercise, waist circumference, BMI, diabetes medication use and DBP (P < 0.05). The VIFs for these eight baseline variables were all less than 5, indicating no multicollinearity (Table 5).
Further inclusion in the multivariate logistic regression analysis revealed that gender, exercise, waist circumference, DBP and diabetes medication use were influential factors (P < 0.05). Details are presented in Table 2.
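For readers less familiar with how logistic regression output maps to the odds ratios (ORs) and confidence intervals reported in this paper, the conversion is OR = exp(β), with a Wald 95% CI of exp(β ± 1.96 × SE). The sketch below is illustrative: the coefficient and standard error are assumed values, back-calculated so that the result lands close to the gender OR quoted in the Discussion (4.15, 95% CI 3.70-4.66); they are not taken from Table 2, and the function name is hypothetical.

```python
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic regression coefficient and its standard error
    into an odds ratio with a Wald confidence interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Assumed, back-calculated values for illustration; not reported by the study.
odds_ratio, lower, upper = odds_ratio_with_ci(beta=1.423, se=0.059)
print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An interval that excludes 1 corresponds to P < 0.05 for the Wald test of that coefficient, which is the criterion used to label a variable an influential factor.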

Univariate and multivariate analyses of biochemical indicators
Biochemical indicators were included in the univariate logistic regression analysis, and a total of nine variables, including GGT, TBil, BUN, TGs, TC, LDL-C, HDL-C, FPG and the eGFR, were screened (P < 0.05). There was no multicollinearity among the nine biochemical indicator variables (all VIF < 5) (Table 5). Further inclusion in the multivariate logistic regression analysis revealed that GGT, TBil, BUN, TGs, TC, LDL-C, HDL-C, FPG and the eGFR were influential factors for hyperuricaemia (P < 0.05, Table 3).

Univariate and multivariate analyses of baseline & biochemical indicators
Baseline and biochemical indicators were included in the univariate logistic regression analysis, and a total of seventeen variables, including gender, tobacco use, alcohol use, exercise, waist circumference, BMI, diabetes medication use, DBP, GGT, TBil, BUN, TGs, TC, LDL-C, HDL-C, FPG and the eGFR, were screened (P < 0.05).

Discussion
Although ANN have been widely used in predictive modelling of diseases, as far as we know, no study has modelled the risk of hyperuricaemia in a large sample of the type 2 diabetic population; previous studies have only modelled and analysed type 2 diabetes 33,34 or hyperuricaemia 19. After comparing the performances of the three models and after model validation, we confirmed that the baseline and biochemical model was the optimal model. Interestingly, we noted in our study that the model for baseline information was superior to that for biochemical indicators.
Hyperuricaemia has long been identified as the greatest risk factor for gout 7, and men are usually more likely to develop hyperuricaemia than premenopausal women 35. However, our study showed that, in the type 2 diabetic population, not only was the detection rate of hyperuricaemia higher in women (33.86%) than in men (11.83%), but the risk of developing hyperuricaemia was also 4.15 times higher (95% CI 3.70-4.66) in women than in men; we also noted that the mean (± standard deviation) age of women in this population was 67.16 ± 7.48 years (P < 0.001). This may be due to the decline in hormone levels in postmenopausal women, who lack the protection of, for example, progesterone 36 and oestrogen 37.
Obesity has been shown to be associated with hyperuricaemia. Firstly, obese individuals tend to have higher levels of uric acid than normal-weight individuals because of their higher urinary excretion and reduced clearance of uric acid 38. Secondly, weight loss in obese individuals is accompanied by reduced uric acid levels and xanthine oxidoreductase (XOR) activity 39; XOR is responsible for the breakdown of hypoxanthine and xanthine into uric acid. Finally, animal experiments have shown that the underlying mechanism of elevated uric acid in obese adipose tissue may be due to dysregulation of adipocytokines and chronic low-grade inflammation 40,41.
Table 3. Univariate and multivariate logistic regression analysis of biochemical indicators. R: reference, ALT: alanine aminotransferase, GGT: γ-glutamyl transpeptidase, TBil: total bilirubin, BUN: blood urea nitrogen, TGs: triglycerides, TC: total cholesterol, LDL-C: low-density lipoprotein cholesterol, HDL-C: high-density lipoprotein cholesterol, FPG: fasting glucose, eGFR: estimated glomerular filtration rate, CI: confidence interval. The eGFR was selected from two similar variables, eGFR and SCr.
A meta-analysis 42 showed that sodium-glucose cotransporter 2 (SGLT-2) inhibitors might potentially prevent gout-related events in patients with type 2 diabetes mellitus, and recent studies 43,44 have shown a reduction in blood uric acid levels in diabetic patients on glucose-lowering drugs. This may be related to the renal protective effects of hypoglycaemic agents [45][46][47], such as SGLT-2 inhibitors, which not only promote anti-inflammatory and antifibrotic pathways, improve renal oxygenation, and reduce glomerular hypertension and hyperfiltration but also reduce the renal hypoxia characteristic of diabetes, thus exerting effects similar to those of β-blockers in the heart. However, the results of this survey showed that not taking glucose-lowering medications was negatively associated with hyperuricaemia in this type 2 diabetic population; the specific names of the glucose-lowering medications taken by this population were not available for this survey, and thus further research is needed to confirm the results.
The mechanism by which lowering blood pressure reduces serum uric acid is still under investigation. In a large trial of 10,617 hypertensive patients, therapeutic control of blood pressure resulted in a significant reduction in the prevalence of hyperuricaemia 48, similar to the present investigation. However, it has also been shown that appropriate systemic blood pressure control may lead to increased uric acid excretion through modulation of glomerular and tubular function, which in turn reduces serum uric acid and may ameliorate various forms of renal damage in the long term 49,50.
Previous studies have shown that the kidneys eliminate 70% of uric acid daily 4; therefore, the functional status of the kidneys also influences the development and progression of hyperuricaemia. Consistent with previous studies 51,52, a decrease in the eGFR is indicative of a decrease in renal function, which can lead to serum uric acid retention and thus increase the risk of developing hyperuricaemia 53. In the type 2 diabetic population, eGFR < 60 mL/(min × 1.73 m²) is generally defined as diabetic nephropathy (DN) 24; therefore, approximately 7.44% (n = 613) of the patients in this population may have had DN, and further deterioration may lead to end-stage renal disease 54. Additionally, some studies 55,56 showed that type 2 diabetes mellitus combined with hyperuricaemia was associated with a higher risk of all-cause mortality and end-stage renal disease. Our model also suggests that when BUN is ≥ 7.5 mmol/L, this population is at increased risk of developing hyperuricaemia. For these reasons, emphasis should be placed on improving the screening and management of renal function in the type 2 diabetic population at an early stage.
When glycaemic control is poor in diabetic patients, uric acid levels are reduced owing to the permeability of glucose, causing increased excretion of urinary sugar, which in turn leads to competitive inhibition of uric acid reabsorption 57, similar to the present findings. Recent studies have found that abnormal liver function is also a risk factor for the development of hyperuricaemia 58, which may be related to the source of uric acid production. However, only elevated γ-glutamyl transferase (GGT) was positively associated with hyperuricaemia in type 2 diabetic patients.
Table 6. Performance comparison of the three models developed. CI: confidence interval, AUC: area under the curve, PLR: positive likelihood ratio, NLR: negative likelihood ratio, NPV: negative predictive value.
Abnormalities in TC, TGs, HDL-C or LDL-C are generally diagnosed as dyslipidaemia, and dyslipidaemia is increasingly shown to be a risk factor for many diseases [59][60][61]. Previous studies 62,63 have confirmed the positive correlation between TGs levels and hyperuricaemia, and Nakanishi et al. 64 found that basal TGs remained an independent predictor of new-onset hyperuricaemia even when long-term medicated patients with diabetes mellitus were excluded, which is consistent with our study. Some studies have also attempted to explain the mechanism linking elevated TGs and hyperuricaemia: as TGs rise, the production and utilisation of free fatty acids in the body increase and the catabolism of adenosine triphosphate is accelerated, leading to an increase in uric acid production 65. Our study found a positive association between elevated LDL-C and hyperuricaemia, which may be related to the role of LDL-C in inducing vascular inflammation, atherogenesis, calcification and thrombosis 66. In agreement with Xu et al. 67, low HDL-C levels can trigger hyperuricaemia. HDL-C has anti-inflammatory, antioxidant and anti-apoptotic effects 68, and it has also been found that HDL-C reduces inflammation induced by urate crystals, suggesting that HDL-C is involved in uric acid-induced inflammatory responses 69.

Clinical and public health potential
Our study identified a total of eleven factors affecting hyperuricaemia in the type 2 diabetes population, which could provide theoretical support for clinical decision-making and give physicians ideas for treating type 2 diabetes combined with hyperuricaemia. Meanwhile, in the health management of the type 2 diabetes population, female patients should pay special attention to their uric acid level, and the monitoring and management of risk factors such as abdominal obesity, elevated blood pressure, decreased liver and renal function, and dyslipidaemia should be strengthened in order to reduce the risk posed by type 2 diabetes mellitus combined with hyperuricaemia. This is of far-reaching significance for preventing progressive deterioration of the disease, enhancing quality of life and reducing medical costs.

Strengths and Limitations
The strength of this study lies in its cross-sectional design to explore the risk factors for hyperuricaemia in a type 2 diabetes mellitus population with a large sample size, as well as in the models based on logistic regression and ANN algorithms that were developed and validated. Our study also has several shortcomings. Firstly, the AUC value of our established ANN model is not outstanding and fails to reach the desired level. Secondly, although our study implemented strict inclusion and exclusion criteria, the cross-sectional nature of the study means that causal relationships with hyperuricaemia remain unclear, and further prospective studies are needed to validate them. Furthermore, it is difficult to control the various biases in the survey, so the accuracy of some of the data cannot be guaranteed. Finally, our study did not include enough variables, in particular omitting the effect of dietary factors on hyperuricaemia. We will address these drawbacks in future studies to validate and refine the risk model.

Conclusion
The ANN model built in this study based on eleven variables performed well and can provide theoretical support for clinical decision-making and self-care of type 2 diabetes mellitus patients to mitigate the harm caused by type 2 diabetes mellitus combined with hyperuricaemia.

Table 2. Univariate and multivariate logistic regression analysis of baseline information.
Baseline information is relatively stable and more reflective of the patient's true condition over a long period of time than biochemical indicators, which are one-off test results that only indicate the patient's biochemical levels in the day or two, or the few days, around testing. In other words, in a cross-sectional study, baseline information may be more important and more reflective of the patient's true condition than biochemical indicators. Certainly, we will need to demonstrate this in future studies.

Table 5. Collinearity diagnostics of the independent variables included in the above three multivariate logistic regression analyses in this study.
Figure 3. Areas under curves (AUCs) of the three ANN models developed.
In type 2 diabetic patients, this investigation therefore does not yet identify abnormal liver function as a risk factor for hyperuricaemia, and further studies are needed to support this.